Exploration server for music in Saarland

Note: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Towards Timbre-Invariant Audio Features for Harmony-Based Music

Internal identifier: 000467 (Main/Exploration); previous: 000466; next: 000468

Authors: Meinard Müller [Germany]; Sebastian Ewert [Germany]

Source: IEEE Transactions on Audio, Speech, and Language Processing (ISSN 1558-7916), 2010

RBID : Pascal:10-0137839

Abstract

Chroma-based audio features are a well-established tool for analyzing and comparing harmony-based Western music that is based on the equal-tempered scale. By identifying spectral components that differ by a musical octave, chroma features possess a considerable amount of robustness to changes in timbre and instrumentation. In this paper, we describe a novel procedure that further enhances chroma features by significantly boosting the degree of timbre invariance without degrading the features' discriminative power. Our idea is based on the generally accepted observation that the lower mel-frequency cepstral coefficients (MFCCs) are closely related to timbre. Now, instead of keeping the lower coefficients, we discard them and only keep the upper coefficients. Furthermore, using a pitch scale instead of a mel scale allows us to project the remaining coefficients onto the 12 chroma bins. We present a series of experiments to demonstrate that the resulting chroma features outperform various state-of-the-art features in the context of music matching and retrieval applications. As a final contribution, we give a detailed analysis of our enhancement procedure revealing the musical meaning of certain pitch-frequency cepstral coefficients.
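
The procedure outlined in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the input is assumed to be one frame of non-negative energies indexed by MIDI pitch, the cepstral transform is assumed to be an orthonormal DCT-II applied after logarithmic compression, and the function name `timbre_reduced_chroma` as well as the parameter values `keep_from=55` and `compression=1000.0` are illustrative choices that do not come from the abstract.

```python
import math


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix (n x n); rows are basis vectors."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def timbre_reduced_chroma(pitch_energy, keep_from=55, compression=1000.0):
    """Sketch: pitch-scale cepstrum, drop lower coefficients, fold to chroma.

    pitch_energy: non-negative energies, index p = MIDI pitch p (assumption).
    keep_from:    first cepstral index that is kept (illustrative value).
    compression:  constant of the log compression (illustrative value).
    """
    n = len(pitch_energy)
    basis = dct_matrix(n)

    # Logarithmic compression of the pitch energies.
    log_pitch = [math.log(1.0 + compression * max(v, 0.0)) for v in pitch_energy]

    # Forward DCT: "pitch-frequency cepstral coefficients".
    coeffs = [sum(b * x for b, x in zip(row, log_pitch)) for row in basis]

    # Discard the lower coefficients, which are associated with timbre.
    coeffs = [0.0 if k < keep_from else c for k, c in enumerate(coeffs)]

    # Inverse DCT (the transpose, because the matrix is orthonormal).
    pitch_rec = [sum(basis[k][i] * coeffs[k] for k in range(n)) for i in range(n)]

    # Project onto the 12 chroma bins by summing pitches one octave apart.
    chroma = [0.0] * 12
    for p, v in enumerate(pitch_rec):
        chroma[p % 12] += v

    # Normalize to unit Euclidean length; near-silent frames map to zeros.
    norm = math.sqrt(sum(c * c for c in chroma))
    if norm < 1e-9:
        return [0.0] * 12
    return [c / norm for c in chroma]
```

Calling `timbre_reduced_chroma(frame)` on a list of 120 pitch energies returns a 12-element vector. Because the lower coefficients are removed, the vector can contain negative entries, unlike a conventional chroma vector.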


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Towards Timbre-Invariant Audio Features for Harmony-Based Music</title>
<author>
<name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>Saarland University and the Max-Planck Institut für Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Ewert, Sebastian" sort="Ewert, Sebastian" uniqKey="Ewert S" first="Sebastian" last="Ewert">Sebastian Ewert</name>
<affiliation wicri:level="3">
<inist:fA14 i1="02">
<s1>Multimedia Signal Processing Group, Department of Computer Science III, Bonn University</s1>
<s2>53117 Bonn</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Bonn</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">10-0137839</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 10-0137839 INIST</idno>
<idno type="RBID">Pascal:10-0137839</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000009</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000005</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000007</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000007</idno>
<idno type="wicri:doubleKey">1558-7916:2010:Muller M:towards:timbre:invariant</idno>
<idno type="wicri:Area/Main/Merge">000467</idno>
<idno type="wicri:Area/Main/Curation">000467</idno>
<idno type="wicri:Area/Main/Exploration">000467</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Towards Timbre-Invariant Audio Features for Harmony-Based Music</title>
<author>
<name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>Saarland University and the Max-Planck Institut für Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Ewert, Sebastian" sort="Ewert, Sebastian" uniqKey="Ewert S" first="Sebastian" last="Ewert">Sebastian Ewert</name>
<affiliation wicri:level="3">
<inist:fA14 i1="02">
<s1>Multimedia Signal Processing Group, Department of Computer Science III, Bonn University</s1>
<s2>53117 Bonn</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Bonn</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">IEEE transactions on audio, speech, and language processing</title>
<title level="j" type="abbreviated">IEEE trans. audio speech lang. process.</title>
<idno type="ISSN">1558-7916</idno>
<imprint>
<date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">IEEE transactions on audio, speech, and language processing</title>
<title level="j" type="abbreviated">IEEE trans. audio speech lang. process.</title>
<idno type="ISSN">1558-7916</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Acoustic signal</term>
<term>Cepstral analysis</term>
<term>Discriminant analysis</term>
<term>Feature extraction</term>
<term>Information retrieval</term>
<term>Invariance</term>
<term>Musical sound</term>
<term>Performance evaluation</term>
<term>Pitch(acoustics)</term>
<term>Signal processing</term>
<term>Sound analysis</term>
<term>State of the art</term>
<term>Timbre</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Analyse son</term>
<term>Son musical</term>
<term>Recherche information</term>
<term>Invariance</term>
<term>Analyse discriminante</term>
<term>Analyse cepstrale</term>
<term>Tonie</term>
<term>Evaluation performance</term>
<term>Etat actuel</term>
<term>Extraction caractéristique</term>
<term>Timbre</term>
<term>Signal acoustique</term>
<term>Traitement signal</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Chroma-based audio features are a well-established tool for analyzing and comparing harmony-based Western music that is based on the equal-tempered scale. By identifying spectral components that differ by a musical octave, chroma features possess a considerable amount of robustness to changes in timbre and instrumentation. In this paper, we describe a novel procedure that further enhances chroma features by significantly boosting the degree of timbre invariance without degrading the features' discriminative power. Our idea is based on the generally accepted observation that the lower mel-frequency cepstral coefficients (MFCCs) are closely related to timbre. Now, instead of keeping the lower coefficients, we discard them and only keep the upper coefficients. Furthermore, using a pitch scale instead of a mel scale allows us to project the remaining coefficients onto the 12 chroma bins. We present a series of experiments to demonstrate that the resulting chroma features outperform various state-of-the-art features in the context of music matching and retrieval applications. As a final contribution, we give a detailed analysis of our enhancement procedure revealing the musical meaning of certain pitch-frequency cepstral coefficients.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
</country>
<region>
<li>District de Cologne</li>
<li>Rhénanie-du-Nord-Westphalie</li>
<li>Sarre (Land)</li>
</region>
<settlement>
<li>Bonn</li>
<li>Sarrebruck</li>
</settlement>
</list>
<tree>
<country name="Allemagne">
<region name="Sarre (Land)">
<name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
</region>
<name sortKey="Ewert, Sebastian" sort="Ewert, Sebastian" uniqKey="Ewert S" first="Sebastian" last="Ewert">Sebastian Ewert</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000467 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000467 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sarre
   |area=    MusicSarreV3
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:10-0137839
   |texte=   Towards Timbre-Invariant Audio Features for Harmony-Based Music
}}

This area was generated with Dilib version V0.6.33.
Data generation: Sun Jul 15 18:16:09 2018. Site generation: Tue Mar 5 19:21:25 2024